NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

KnowGraph: Knowledge-Enabled Anomaly Detection via Logical Reasoning on Graph Data

https://doi.org/10.1145/3658644.3690354

Zhou, Andy; Xu, Xiaojun; Raghunathan, Ramesh; Lal, Alok; Guan, Xinze; Yu, Bin; Li, Bo (December 2024, ACM)

Graph-based anomaly detection is pivotal in diverse security applications, such as fraud detection in transaction networks and intrusion detection for network traffic. Standard approaches, including Graph Neural Networks (GNNs), often struggle to generalize across shifting data distributions. For instance, we observe that a real-world eBay transaction dataset revealed an over 50% decline in fraud detection accuracy when adding data from only a single new day to the graph due to data distribution shifts. This highlights a critical vulnerability in purely data-driven approaches. Meanwhile, real-world domain knowledge, such as "simultaneous transactions in two locations are suspicious," is more stable and a common existing component of real-world detection strategies. To explicitly integrate such knowledge into data-driven models such as GCNs, we propose KnowGraph, which integrates domain knowledge with data-driven learning for enhanced graph-based anomaly detection. KnowGraph comprises two principal components: (1) a statistical learning component that utilizes a main model for the overarching detection task, augmented by multiple specialized knowledge models that predict domain-specific semantic entities; (2) a reasoning component that employs probabilistic graphical models to execute logical inferences based on model outputs, encoding domain knowledge through weighted first-order logic formulas. In addition, KnowGraph has leveraged the Predictability-Computability-Stability (PCS) framework for veridical data science to estimate and mitigate prediction uncertainties. Empirically, KnowGraph has been rigorously evaluated on two significant real-world scenarios: collusion detection in the online marketplace eBay and intrusion detection within enterprise networks. Extensive experiments on these large-scale real-world datasets show that KnowGraph consistently outperforms state-of-the-art baselines in both transductive and inductive settings, achieving substantial gains in average precision when generalizing to completely unseen test graphs. Further ablation studies demonstrate the effectiveness of the proposed reasoning component in improving detection performance, especially under extreme class imbalance. These results highlight the potential of integrating domain knowledge into data-driven models for high-stakes, graph-based security applications.
more » « less
Full Text Available
Rethinking machine unlearning for large language models

https://doi.org/10.1038/s42256-025-00985-0

Liu, Sijia; Yao, Yuanshun; Jia, Jinghan; Casper, Stephen; Baracaldo, Nathalie; Hase, Peter; Yao, Yuguang; Liu, Chris Yuhao; Xu, Xiaojun; Li, Hang; et al (February 2025, Nature Machine Intelligence)

Full Text Available
POSTER: Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

https://doi.org/10.1145/3634737.3659430

Sahabandu, Dinuka; Xu, Xiaojun; Rajabi, Arezoo; Niu, Luyao; Ramasubramanian, Bhaskar; Li, Bo; Poovendran, Radha (July 2024, ACM)

Full Text Available
Effective and Efficient Federated Tree Learning on Hybrid Data

Li, Qinbin; Xie, Chulin; Xu, Xiaojun; Liu, Xiaoyuan; Zhang, Ce; Li, Bo; He, Bingsheng; Song, Dawn (May 2024, In Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024))

Full Text Available
How to Cover up Anomalous Accesses to Electronic Health Records

Xu, Xiaojun; Hao, Qingying; Yang, Zhuolin; Li, Bo; Liebovitz, David; Wang, Gang; Gunter, Carl (January 2023, The 32nd USENIX Security Symposium (USENIX Security))

Full Text Available
On the Certified Robustness for Ensemble Models and Beyond

Yang, Zhuolin; Li, Linyi; Xu, Xiaojun; Kailkhura, Bhavya; Xie, Tao; Li, Bo (January 2022, International Conference on Learning Representations)

Full Text Available
Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization

Xu, Xiaojun; Zhang, Jacky Y; Ma, Evelyn; Son, Hyun Ho; Koyejo, Sanmi; Li, Bo (January 2022, Proceedings of Machine Learning Research)
Chaudhuri, Kamalika; Jegelka, Stefanie; Song, Le; Szepesvari, Csaba; Niu, Gang; Sabato, Sivan (Ed.)
Machine learning (ML) robustness and domain generalization are fundamentally correlated: they essentially concern data distribution shifts under adversarial and natural settings, respectively. On one hand, recent studies show that more robust (adversarially trained) models are more generalizable. On the other hand, there is a lack of theoretical understanding of their fundamental connections. In this paper, we explore the relationship between regularization and domain transferability considering different factors such as norm regularization and data augmentations (DA). We propose a general theoretical framework proving that factors involving the model function class regularization are sufficient conditions for relative domain transferability. Our analysis implies that “robustness" is neither necessary nor sufficient for transferability; rather, regularization is a more fundamental perspective for understanding domain transferability. We then discuss popular DA protocols (including adversarial training) and show when they can be viewed as the function class regularization under certain conditions and therefore improve generalization. We conduct extensive experiments to verify our theoretical findings and show several counterexamples where robustness and generalization are negatively correlated on different datasets.
more » « less
Full Text Available
TSS: Transformation-Specific Smoothing for Robustness Certification

https://doi.org/10.1145/3460120.3485258

Li, Linyi; Weber, Maurice; Xu, Xiaojun; Rimanic, Luka; Kailkhura, Bhavya; Xie, Tao; Zhang, Ce; Li, Bo (November 2021, Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS'21))

As machine learning (ML) systems become pervasive, safeguarding their security is critical. However, recently it has been demonstrated that motivated adversaries are able to mislead ML systems by perturbing test data using semantic transformations. While there exists a rich body of research providing provable robustness guarantees for ML models against ℓp norm bounded adversarial perturbations, guarantees against semantic perturbations remain largely underexplored. In this paper, we provide TSS -- a unified framework for certifying ML robustness against general adversarial semantic transformations. First, depending on the properties of each transformation, we divide common transformations into two categories, namely resolvable (e.g., Gaussian blur) and differentially resolvable (e.g., rotation) transformations. For the former, we propose transformation-specific randomized smoothing strategies and obtain strong robustness certification. The latter category covers transformations that involve interpolation errors, and we propose a novel approach based on stratified sampling to certify the robustness. Our framework TSS leverages these certification strategies and combines with consistency-enhanced training to provide rigorous certification of robustness. We conduct extensive experiments on over ten types of challenging semantic transformations and show that TSS significantly outperforms the state of the art. Moreover, to the best of our knowledge, TSS is the first approach that achieves nontrivial certified robustness on the large-scale ImageNet dataset. For instance, our framework achieves 30.4% certified robust accuracy against rotation attack (within ±30∘) on ImageNet. Moreover, to consider a broader range of transformations, we show TSS is also robust against adaptive attacks and unforeseen image corruptions such as CIFAR-10-C and ImageNet-C.
more » « less
Full Text Available
Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection

https://doi.org/10.1145/3133956.3134018

Xu, Xiaojun; Liu, Chang; Feng, Qian; Yin, Heng; Song, Le; Song, Dawn (October 2017, Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security)

Full Text Available

Search for: All records